Fix crash on continue after a missing package non-critical error #5942

M4rtinK · 2024-10-16T12:16:16Z

Because of the way exceptions propagate from the Anaconda installation tasks, a non-critical error exception ended up interrupting the loop that iterates over the payload installation tasks.

Because the exception was not handled "deep enough" but instead "bubbled up" to a top-level error handler, the loop apparently got interrupted and the remaining tasks were skipped.

This resulted in an unrelated crash ("kernel version list no available"), because all the important installation tasks were skipped, packages were not installed and installation-related data was not populated.

The end result was that a non-critical error (such as a missing package) would trigger a dialog asking the user to quit or continue - but clicking "continue" would result in a weird crash.
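
The failure mode described above can be reproduced in a minimal, self-contained sketch. This is not the Anaconda code: `ask_user_to_continue`, `run_tasks_with_top_level_handler` and the task names are hypothetical stand-ins, and the exception class is redefined locally so the example runs on its own.

```python
class NonCriticalInstallationError(Exception):
    """Local stand-in for a recoverable error such as a missing package."""


def ask_user_to_continue(error):
    """Hypothetical top-level handler; pretend the user clicked "continue"."""
    return True


def run_tasks_with_top_level_handler(tasks):
    """Run all tasks with error handling only around the whole loop."""
    completed = []
    try:
        for name, task in tasks:
            task()
            completed.append(name)
    except NonCriticalInstallationError as e:
        # The for loop has already been unwound when we get here,
        # so "continue" cannot resume the remaining tasks.
        if not ask_user_to_continue(e):
            raise
    return completed


def missing_package():
    raise NonCriticalInstallationError("a requested package is missing")


# Hypothetical task list; the real tasks are not shown in this PR.
tasks = [
    ("set up sources", lambda: None),
    ("resolve packages", missing_package),
    ("install packages", lambda: None),
    ("collect kernel versions", lambda: None),
]

completed_before_fix = run_tasks_with_top_level_handler(tasks)
print(completed_before_fix)
```

Even though the user "continued", only the first task completes; everything after the failing task is silently skipped, which matches the later, seemingly unrelated crash.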

So move the error handler check closer to the task execution, to prevent the loop from being interrupted. That way the loop will resume its iteration if "continue" is clicked in the UI.

Also convert the non-critical error to a fatal one if it gets raised (the user deciding not to continue after the non-critical error). This is necessary, as otherwise the top-level error handler would get triggered, asking the user again to quit or continue.
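
A minimal sketch of the approach described above, under stated assumptions: the exception classes and the `ERROR_RAISE` / `ERROR_CONTINUE` constants are local stand-ins, `run_tasks_with_per_task_handler` and `error_callback` are hypothetical names, and the sketch catches only the non-critical error type for brevity. It is not the actual change in `pyanaconda/payload/migrated.py`.

```python
class NonCriticalInstallationError(Exception):
    """Local stand-in for a recoverable error such as a missing package."""


class PayloadInstallationError(Exception):
    """Local stand-in for a fatal payload installation error."""


# Stand-in return values of the error callback (assumed for this sketch).
ERROR_RAISE = 0
ERROR_CONTINUE = 1


def run_tasks_with_per_task_handler(tasks, error_callback):
    """Run tasks, consulting the error callback right where a task fails."""
    completed = []
    for name, task in tasks:
        try:
            task()
        except NonCriticalInstallationError as e:
            if error_callback(e) == ERROR_RAISE:
                # The user chose to quit: re-raise as a fatal error so that
                # a top-level handler does not ask the same question again.
                raise PayloadInstallationError(str(e)) from e
            # The user chose to continue: move on to the next task.
            continue
        completed.append(name)
    return completed


def missing_package():
    raise NonCriticalInstallationError("a requested package is missing")


# Hypothetical task list; the real tasks are not shown in this PR.
tasks = [
    ("set up sources", lambda: None),
    ("resolve packages", missing_package),
    ("install packages", lambda: None),
    ("collect kernel versions", lambda: None),
]

continued = run_tasks_with_per_task_handler(tasks, lambda e: ERROR_CONTINUE)
print(continued)
```

Because the `try` block sits inside the loop body, a "continue" decision only skips the failing task, while a "quit" decision leaves the loop as a single fatal error with the original exception chained as its cause.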

NOTE: Longer term we really should clean this up & have all installation tasks gathered, ordered and executed from a single place. Then all the error handling could be in a single place, making things much simpler.

(cherry picked from commit 73b412f)

Resolves: RHEL-57699
Resolves: INSTALLER-4045
Related: rhbz#2238045

Backport of #5937 to the RHEL 10 branch.


M4rtinK · 2024-10-16T12:39:53Z

/kickstart-test --testtype smoke

M4rtinK · 2024-10-17T12:23:35Z

/kickstart-test --testtype smoke

M4rtinK · 2024-10-17T21:11:49Z

/kickstart-test --skip-testtypes whatever

M4rtinK · 2024-10-18T10:26:50Z

Wow, all 120 tests passed with 5 retries - seems to be working, thanks @rvykydal! :)

I've also checked the results - for retried tests both sets of results are there, so if we wanted to look for race conditions, we can still do that even with retries on. :)

jkonecny12

Looks good to me.

KKoukiou · 2024-10-22T10:49:27Z

pyanaconda/payload/migrated.py

+                        # This results in a nice error dialog with "Exit Installer" button
+                        # being shown.
+                        raise PayloadInstallationError(str(e)) from e
+                    else:


Hm, I don't get why we need to check for ERROR_RAISE, handle a broad Exception and raise again.
If we don't handle it, it will just propagate up. What am I missing here?

I mean convert this to::

    try:
        sync_run_task(task_proxy)
    except NonCriticalInstallationError as e:
        # Handle the non-critical error by raising it as a fatal PayloadInstallationError
        raise PayloadInstallationError(str(e)) from e

M4rtinK added the rhel-10 label Oct 16, 2024

jkonecny12 approved these changes Oct 21, 2024


KKoukiou reviewed Oct 22, 2024


M4rtinK added the ready to merge label Oct 22, 2024

M4rtinK merged commit bb02888 into rhinstaller:rhel-10 Oct 23, 2024
17 checks passed
